STOCHASTIC APPROXIMATION WITH DISCONTINUOUS DYNAMICS
and STATE DEPENDENT NOISE: W.P.1 CONVERGENCE*

by

Harold J. Kushner


ABSTRACT 


Stochastic approximations of the form $X_{n+1} = X_n + a_n h(X_n, \xi_n)$ are treated where $h(\cdot,\cdot)$ might not be continuous and the noise sequence $\{\xi_n\}$ might depend on $\{X_n\}$. An 'averaging' and an 'ordinary differential equation' method are combined to get w.p.1 convergence both for the above algorithm and for the case where the iterates are projected back onto a bounded set $G$ if they ever leave it. Two examples are developed, the first being an automata problem where the dynamics are not smooth and the noise is state dependent, and the second a Robbins-Monro process with observation averaging (which causes the noise to be state dependent). Each example is typical of a larger class.




1.  Introduction. 

References [1], [2] present a collection of fairly general methods for proving w.p.1 and weak convergence results for stochastic approximations of the type

(1.1)  $X_{n+1} = X_n + a_n h(X_n, \xi_n)$,  $X_n \in R^r$, Euclidean $r$-space,

where $\{\xi_n\}$ is a sequence of random variables and $0 < a_n \to 0$, $\sum a_n = \infty$. Also, several stochastic approximation schemes for sequential Monte Carlo function minimization or equation solving under equality and inequality constraints were dealt with. One, among others, is the projection method. Let $q_1, \ldots, q_m$ denote continuously differentiable functions and define $G = \{x: q_i(x) \le 0,\ i = 1, \ldots, m\}$; then the algorithm is

(1.2)  $X_{n+1} = \pi_G(X_n + a_n h(X_n, \xi_n))$,

where $\pi_G(y)$ denotes the closest point on $G$ to $y$. Both weak convergence and w.p.1 results were proved for this and several other 'constrained' algorithms.

If $h(x,\xi)$ is not additive in $\xi$, then the methods in [1] (and also in [3], which deals with related algorithms, at least for the unconstrained case) require that $h(\cdot,\cdot)$ be continuous. In many applications, $h(\cdot,\cdot)$ is not continuous (e.g., $h(\cdot,\cdot)$ might be an indicator function). Here, we combine some of the basic ideas from [1] together with the averaging methods of [4], [5] to develop an alternative method which is more convenient when $h(\cdot,\cdot)$ is not smooth, and which is often quite advantageous if $\{\xi_n\}$ is state dependent. We rely on the assumption that even if $h(\cdot,\cdot)$ is not smooth, expectations or conditional expectations of the types $Eh(x,\xi_n)$, $E[h(x,\xi_n) \mid \xi_{n-1}, \xi_{n-2}, \ldots]$ are smooth functions of $x$. This situation occurs in many examples. Reference [6] also makes such an assumption for non-smooth $h(\cdot,\cdot)$, but deals with $a_n \equiv a > 0$ and a finite time interval $\{n: an \le T\}$.
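As a concrete illustration of a discontinuous $h(\cdot,\cdot)$ whose mean is smooth, the following minimal sketch estimates the $p$-quantile of a Uniform(0,1) random variable with a projected recursion of the type (1.2). The choice $h(x,\xi) = p - I\{\xi \le x\}$, the gains $a_n = 1/(n+1)$, the interval $[0,1]$ and all function names are assumptions made for illustration only; they do not come from the paper. Here $Eh(x,\xi) = p - x$ on $[0,1]$, which is smooth in $x$ although $h$ itself jumps.

```python
import random


def h(x, xi, p):
    # Discontinuous in x: built from the indicator of the event {xi <= x}.
    return p - (1.0 if xi <= x else 0.0)


def project(y, lo=0.0, hi=1.0):
    # Closest point of the interval [lo, hi] to y.
    return min(max(y, lo), hi)


def run_quantile_sa(n_steps=20000, x0=0.9, p=0.3, seed=0):
    rng = random.Random(seed)
    x = x0
    for n in range(n_steps):
        a_n = 1.0 / (n + 1)          # sum a_n = inf, sum a_n^2 < inf
        xi = rng.random()            # noise, independent of the state here
        x = project(x + a_n * h(x, xi, p))
    return x
```

With these assumed settings the mean dynamics are $\dot x = p - x$, so the iterate is expected to settle near $x = p$.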

In Sections 2 and 3 we treat the cases (1.1) and (1.2), respectively, where $\{\xi_n\}$ is bounded and not state dependent. Section 4 deals with the case of state dependent $\{\xi_n\}$, and the 'unbounded' noise case is briefly discussed. The convergence is w.p.1 in all cases. Two interesting classes of examples appear in Sections 5 and 6.


2. The algorithm (1.1).

Assumptions. $E_n$ denotes expectation conditioned on $\{\xi_j, j < n\}$. $K$ denotes a constant whose value might change from usage to usage and $\delta X_n$ denotes $X_{n+1} - X_n$.

A1. $\sum a_n^2 < \infty$, $\sum a_n = \infty$, $\{a_{n+1}/a_n\}$ is bounded, $h(\cdot,\cdot)$ is measurable and $h(x,\cdot)$ is bounded uniformly on bounded $x$-sets. $\{\xi_n\}$ is uniformly bounded.

A2. There is a twice continuously differentiable Liapunov function $0 \le V(x)$ such that $V_{xx}(\cdot)$ is bounded, $V(x) \to \infty$ as $|x| \to \infty$ and, for some $\epsilon_0 > 0$ and compact set $Q_0$ of the form $\{x: V(x) \le \lambda_0\}$, $V_x'(x)\bar h(x) \le -\epsilon_0$ for $x \notin Q_0$, where $\bar h(\cdot)$ is defined in (A3).


A3. There is a continuously differentiable function $\bar h(\cdot)$ and a null set $N_0$ such that for each $n$ and $x$ and $\omega \notin N_0$, the function defined by

$V_0(x,n) \equiv \sum_{j=n}^{\infty} a_j V_x'(x) E_n[h(x,\xi_j) - \bar h(x)]$

is bounded by $K a_n (1 + |V_x'(x)\bar h(x)|)$, where the convergence for $V_0(x,n)$ and for all infinite sums of the sequel is in the sense $\lim_N \sum_{j=n}^{N} a_j[\,\cdot\,]$ for each $x$, and where the sequence of partial sums is bounded uniformly on compact $x$-sets.


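A4. $E_n|h(x,\xi_j)|^2 \le K(1 + |V_x'(x)\bar h(x)|)$, $j \ge n$.

A5. $|V_x'(x)\bar h(x)| \le K(1 + V(x))$.

A6. Let $[\ ]_x$ denote the gradient here. Then

$\left|\sum_{j=n+1}^{\infty} a_j \left[V_x'(x) E_{n+1}(h(x,\xi_j) - \bar h(x))\right]_x\right| \le K a_n (1 + |V_x'(x)\bar h(x)|^{1/2}).$

A7. For $0 \le s \le 1$,

$E_n|V_x'(x + s a_n h(x,\xi_n))\,\bar h(x + s a_n h(x,\xi_n))| \le K(1 + |V_x'(x)\bar h(x)|).$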
The  examples  show  that  the  assumptions  are  often  not  restrictive. 


Let $X^0(\cdot)$ denote the continuous piecewise linear function which equals $X_0$ on $(-\infty, 0]$, equals $X_n$, $n \ge 0$, at $t = t_n = \sum_{i=0}^{n-1} a_i$, and in $[t_n, t_{n+1}]$ is a linear interpolation of $X_n$ and $X_{n+1}$. Define $X^n(\cdot)$ by $X^n(t) = X^0(t + t_n)$. Note that $X^n(0) = X^0(t_n) = X_n$ and define $m(t) = \max\{n: t_n \le t\}$ for $t \ge 0$ and $m(t) = 0$ for $t < 0$.
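The interpolation just defined can be sketched in a few lines for scalar iterates. The function names and the list-based representation are assumptions for illustration. Because a simulation only produces finitely many iterates, the sketch holds the last iterate constant beyond the final breakpoint, which is a convenience and not part of the definition above.

```python
from bisect import bisect_right


def breakpoints(a):
    """Return [t_0, ..., t_N] with t_n = a_0 + ... + a_{n-1} and t_0 = 0."""
    t = [0.0]
    for a_i in a:
        t.append(t[-1] + a_i)
    return t


def m(t_grid, t):
    """m(t) = max{n : t_n <= t} for t >= 0, and 0 for t < 0."""
    if t < 0:
        return 0
    return bisect_right(t_grid, t) - 1


def X0(t_grid, X, t):
    """Piecewise linear interpolation of the iterates X[0], ..., X[N]."""
    if t <= 0:
        return X[0]
    if t >= t_grid[-1]:
        return X[-1]  # assumption: hold the last iterate past the horizon
    n = m(t_grid, t)
    w = (t - t_grid[n]) / (t_grid[n + 1] - t_grid[n])
    return (1 - w) * X[n] + w * X[n + 1]


def Xn(t_grid, X, n, t):
    """Left-shifted interpolation X^n(t) = X^0(t + t_n)."""
    return X0(t_grid, X, t + t_grid[n])
```

For example, with gains `[1, 0.5, 0.25]` and iterates `[0, 2, 1, 3]`, the breakpoints are `[0, 1, 1.5, 1.75]` and `Xn(t_grid, X, 1, 0.0)` returns the iterate `X[1]`.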


Theorem 1. Assume (A1)-(A7). Then $\{X_n\}$ is bounded w.p.1. If $V_x'(x)\bar h(x) \le 0$ for all $x$, then $X_n \to \{x: V_x'(x)\bar h(x) = 0\}$ w.p.1. In general, $\{X_n\}$ converges w.p.1 to the largest bounded invariant set of

(2.1)  $\dot x = \bar h(x)$.

If $x_0 \equiv x(t)$ is an asymptotically stable solution of (2.1) (in the sense of Liapunov) with domain of attraction $DA(x_0)$, and if $X_n \in$ compact $A \subset DA(x_0)$ infinitely often, then (except for $\omega$ in a null set) $X_n \to x_0$ as $n \to \infty$.


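Proof. We have

(2.2)  $E_n V(X_{n+1}) - V(X_n) = a_n V_x'(X_n) E_n h(X_n,\xi_n) + a_n^2 E_n \int_0^1 (1-s)\, h'(X_n,\xi_n) V_{xx}(X_n + s\,\delta X_n)\, h(X_n,\xi_n)\, ds.$

Also

(2.3)  $E_n V_0(X_{n+1}, n+1) - V_0(X_n, n) = E_n V_0(X_{n+1}, n+1) - \sum_{j=n+1}^{\infty} a_j V_x'(X_n) E_n[h(X_n,\xi_j) - \bar h(X_n)]$
       $\qquad - a_n V_x'(X_n) E_n[h(X_n,\xi_n) - \bar h(X_n)],$

which equals

(2.4)  last line of (2.3) $+\ E_n \int_0^1 [V_0(X_n + s\,\delta X_n, n+1)]_x'\, \delta X_n\, ds.$

The last term in (2.4) is bounded by $O(a_n^2)\,O(1 + |V_x'(X_n)\bar h(X_n)|)$. Define $\tilde V(n) = V(X_n) + V_0(X_n,n)$. Then, by the above calculations,

(2.5)  $E_n \tilde V(n+1) - \tilde V(n) = a_n(1 + a_n\epsilon_n) V_x'(X_n)\bar h(X_n) + \tilde\epsilon_n a_n^2,$

where $\{\epsilon_n\}$, $\{\tilde\epsilon_n\}$ are sequences of uniformly bounded random variables.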

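Thus we can write (with $\{\epsilon_i\}$, $\{\tilde\epsilon_i\}$ the bounded sequences appearing in (2.5))

(2.6)  $\tilde V(n) - \tilde V(0) - \sum_{i=0}^{n-1} a_i(1 + a_i\epsilon_i)V_x'(X_i)\bar h(X_i) - \sum_{i=0}^{n-1}\tilde\epsilon_i a_i^2 = \sum_{i=0}^{n-1} m_i \equiv M_n,$

where (2.6) defines $m_i$ and $M_n$ is a martingale. Note that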




n 


V(nn)  -  vcn)  -  -  e„a„. 


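Define $W(n) = \tilde V(n) + E_n\sum_{j=n}^{\infty}\tilde\epsilon_j a_j^2$, where $\{\tilde\epsilon_j\}$ is the bounded sequence multiplying $a_j^2$ in (2.5), and note that $W(n) \ge -O(a_n)$ for large $n$ by (A3), (A5).

Let $n_0$ be a stopping time such that $X_{n_0} \notin Q_0$ and define $n_1 = \min\{n: n > n_0, X_n \in Q_0\}$. Then $\{\tilde W(n) = W(n \wedge n_1), n \ge n_0\}$ is a supermartingale bounded below by $-O(a_n)$, and $E_n\tilde W(n+1) - \tilde W(n) \le -\epsilon_0 a_n/2$ for $n_0 \le n < n_1$, $n_0$ large. This implies that $Q_0$ is a recurrence set; i.e., $X_n \in Q_0$ for infinitely many $n$ w.p.1. Let $\lambda_1 > \lambda_0$ and define $Q_1 = \{x: V(x) \le \lambda_1\}$. For each such $Q_1$ there is a real $K(Q_1)$ such that $E_n m_n^2 \le K(Q_1)a_n^2$ for $X_n \in Q_1$. Define $n_2 = \min\{n: X_n \notin Q_1, n \ge n_0\}$. Then

(2.7)  $P\left\{\sup_{n_0 \le n < n_2} \left|\sum_{i=n_0}^{n} m_i\right| \ge \epsilon\right\} \le K(Q_1)\, E\sum_{i=n_0}^{n_2-1} a_i^2/\epsilon^2.$

From the above part of this paragraph and the fact that $V_x'(x)\bar h(x) \le -\epsilon_0$ for $x \notin Q_0$ and the boundedness of $|h(x,\xi)|$, $x \in Q_1$, we conclude that eventually (w.p.1) $X_n$ stays in $Q_1$ (for any $\lambda_1 > \lambda_0$).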

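Also,

(2.8a)  $\sup_{m > n} \left|V(X_m) - V(X_n) - \sum_{i=n}^{m-1} a_i(1 + a_i\epsilon_i)V_x'(X_i)\bar h(X_i)\right| \to 0$ w.p.1 as $n \to \infty$

or, equivalently, using $m(t_n) = n$,

(2.8b)  $\sup_{s \ge 0} \left|V(X^n(s)) - V(X^n(0)) - \sum_{i=n}^{m(t_n+s)-1} a_i(1 + a_i\epsilon_i)V_x'(X_i)\bar h(X_i)\right| \to 0$ w.p.1 as $n \to \infty$.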


Let $\Omega_1$ = (set of non-recurrence of $Q_0$) $\cup$ (set of non-convergence of (2.8)). By the w.p.1 boundedness of $\{X_n\}$, $\{X^n(\cdot)\}$ is uniformly continuous for $\omega \notin$ a null set $\Omega_2$. Fix $\omega \notin \Omega_1 \cup \Omega_2 \cup N_0$. By the Arzelà-Ascoli Theorem, pick a convergent subsequence (converging uniformly on bounded intervals) of $\{X^n(\cdot)\}$, with limit $X(\cdot)$. Then

(2.9)  $V(X(t)) = V(X(0)) + \int_0^t V_x'(X(s))\bar h(X(s))\,ds.$

Equation (2.8) implies that if $V_x'(x)\bar h(x) \le 0$ for all $x$, then $X_n \to S_0 = \{x: V_x'(x)\bar h(x) = 0\}$ w.p.1 as $n \to \infty$.

Next, let $f(\cdot)$ be a real valued function on $R^r$ with compact support and continuous second derivatives. With $f(\cdot)$ replacing $V(\cdot)$, define $f_0(x,n)$, $\tilde f(n)$ as $V_0(x,n)$, $\tilde V(n)$ were defined. Then (2.8) holds for $f(\cdot)$ replacing $V(\cdot)$. By choosing $f(\cdot)$ such that $f(x) = x_i$, $i = 1, \ldots, r$, in the set $Q_1$, where $x_i$ is the $i$-th component of $x$, we see there is a bounded sequence $\{\epsilon_n\}$ such that

(2.10)  $\sup_{s \ge 0} \left|X^n(s) - X^n(0) - \sum_{i=n}^{m(t_n+s)-1} a_i(1 + a_i\epsilon_i)\bar h(X_i)\right| \to 0$ w.p.1 as $n \to \infty$.

Thus any limit $X(\cdot)$ of $\{X^n(\cdot)\}$ must satisfy (2.1) and the possible limit points of $\{X_n\}$ are contained w.p.1 in the largest bounded invariant set of (2.1). The assertion concerning asymptotically stable $x(t) \equiv x_0$ is now readily proved (see, e.g., the proof of Theorem 2.3.1 of [1]), and the details are omitted. Q.E.D.


3. The Projection Method.

Let $G$ be as defined in Section 1. For the continuous vector field $\bar h(\cdot)$ define $\pi(\bar h(x))$ = projection of $\bar h(x)$ onto $G$; i.e., $\pi(\bar h(x)) = \lim_{\Delta \to 0} [\pi_G(x + \Delta\bar h(x)) - x]/\Delta$. The limit need not be unique. We will need

(A8). (A3) and (A6) hold, but with $V_x'(x)$ dropped and the right sides $O(a_n)$.

(A9). $q_i(\cdot)$, $i = 1, \ldots, m$, are continuously differentiable, $G$ is bounded and is the closure of its interior $G^0 = G - \partial G = \{x: q_i(x) < 0,\ i = 1, \ldots, m\}$, and at each $x \in \partial G$ the gradients of the active constraints are linearly independent.
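To make the projected vector field concrete, the sketch below takes $G$ to be a box (an assumption; a box is one simple set described by linear $q_i$ with linearly independent active gradients) and approximates $\pi(\bar h(x))$ by the difference quotient in the definition with a small fixed $\Delta$. The function names and the value of `delta` are illustrative choices and are not taken from the paper.

```python
def project_box(y, lo, hi):
    """Closest point of the box {x : lo[i] <= x[i] <= hi[i]} to y."""
    return [min(max(y_i, l_i), h_i) for y_i, l_i, h_i in zip(y, lo, hi)]


def projected_field(x, h_bar, lo, hi, delta=1e-6):
    """Finite-difference approximation of [pi_G(x + delta*h_bar(x)) - x]/delta."""
    hx = h_bar(x)
    y = [x_i + delta * h_i for x_i, h_i in zip(x, hx)]
    p = project_box(y, lo, hi)
    return [(p_i - x_i) / delta for p_i, x_i in zip(p, x)]
```

At an interior point the field is returned unchanged (up to rounding), while on a face of the box the outward-pointing component is removed; for instance, at `x = [1.0, 0.5]` in the unit square the field `[2.0, -1.0]` is mapped to approximately `[0.0, -1.0]`.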

Theorem 2. Assume (A1), (A8), (A9). Then $\{X^n(\cdot)\}$ is uniformly continuous on $[0,\infty)$. There is a null set $\Omega_0$ such that for $\omega \notin \Omega_0$, any limit $X(\cdot)$ of a convergent (uniformly on bounded intervals) subsequence of $\{X^n(\cdot)\}$ satisfies

(3.1)  $\dot x = \pi(\bar h(x))$.

If $X_n \in$ compact $A \subset DA(x_0)$ infinitely often and $\omega \notin \Omega_0$, and $x_0 \equiv x(t)$ is an asymptotically stable point of (3.1), then $X_n \to x_0$. Let $H(\cdot) \ge 0$ be a real valued function whose second mixed partial derivatives are continuous and $\bar h(x) = -H_x(x)$. Define $KT$ = set of points where $H_x'(x)\pi(\bar h(x)) = 0$, and suppose that $KT = \bigcup_i S_i$, where the $S_i$ are disjoint, closed and such that $H(x)$ is constant on each $S_i$. Then $X_n \to KT$ w.p.1 as $n \to \infty$.


Proof. The proof is very similar to that of Theorem 1. Let $f(\cdot)$ be an arbitrary real valued function on $R^r$ with continuous second partial derivatives. Then

$E_n f(X_{n+1}) - f(X_n) = a_n f_x'(X_n)E_n h(X_n,\xi_n) + a_n f_x'(X_n)E_n\tau_n + O(a_n^2),$

where $\tau_n = [X_{n+1} - X_n - a_n h(X_n,\xi_n)]/a_n = O(1)$. Note that there is a $K$ such that $\tau_n = 0$ if distance$(X_n, \partial G) \ge Ka_n$ and that $\tau_n$ lies in the cone

■  ‘y-  ‘•i.x'Vl’y  5  “ 

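Define $f_0(x,n)$ by

$f_0(x,n) = \sum_{j=n}^{\infty} a_j f_x'(x)E_n[h(x,\xi_j) - \bar h(x)]$

and set $\tilde f(n) = f(X_n) + f_0(X_n,n)$. There is a bounded sequence $\{\epsilon_n\}$ such that

$E_n\tilde f(n+1) - \tilde f(n) - \epsilon_n a_n^2 - a_n f_x'(X_n)\bar h(X_n) - a_n f_x'(X_n)E_n\tau_n = 0,$

$\tilde f(n) - \tilde f(0) - \sum_{i=0}^{n-1}\epsilon_i a_i^2 - \sum_{i=0}^{n-1} a_i f_x'(X_i)\bar h(X_i) - \sum_{i=0}^{n-1} a_i f_x'(X_i)\tau_i \equiv \sum_{i=0}^{n-1} m_i = M_n,$

where $M_n$ is a martingale and $E m_n^2 \le Ka_n^2$. As in Theorem 1,

(3.2)  $\sup_{s \ge 0} \left|f(X^n(s)) - f(X^n(0)) - \sum_{i=n}^{m(t_n+s)-1} a_i f_x'(X_i)\bar h(X_i) - \sum_{i=n}^{m(t_n+s)-1} a_i f_x'(X_i)\tau_i\right| \to 0$

w.p.1 as $n \to \infty$,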


from which follows

(3.3)  $\sup_{s \ge 0} \left|X^n(s) - X^n(0) - \sum_{i=n}^{m(t_n+s)-1} a_i\bar h(X_i) - \sum_{i=n}^{m(t_n+s)-1} a_i\tau_i\right| \to 0$

w.p.1 as $n \to \infty$.


Also, $\{X^n(\cdot)\}$ is equicontinuous, since $h(\cdot,\cdot)$ is bounded. Let $\Omega_0$ denote the set of nonconvergence in (3.3) and for fixed $\omega \notin \Omega_0$, extract a convergent subsequence of $\{X^n(\cdot)\}$ (uniformly on bounded intervals) with limit denoted by $X(\cdot)$. Define $\bar h_0(x) = \pi(\bar h(x))$ and $\bar h_1(x) = \bar h(x) - \bar h_0(x)$. Then, by (3.3) there is a bounded $R^r$-valued measurable function $\tau(\cdot)$ such that $\tau(s) = 0$ unless $X(s) \in \partial G$, and if $X(s) \in \partial G$ then $\tau(s)$ is in the cone $C(X(s))$ and (3.4) holds.

(3.4)  $X(t) = X(0) + \int_0^t \bar h(X(s))\,ds + \int_0^t \tau(s)\,ds$
       $\qquad = X(0) + \int_0^t \bar h_0(X(s))\,ds + \int_0^t \bar h_1(X(s))\,ds + \int_0^t \tau(s)\,ds.$

The last two integrals on the right of (3.4) must cancel if $X(t)$ is to remain in $G$ for all $t$. Thus (3.1) holds w.p.1.

If $\bar h(x) = -H_x(x)$, then use $H(\cdot)$ as a Liapunov function for (3.1) to get

(3.5)  $\dot H(x) = H_x'(x)\pi(-H_x(x)) \le 0,$

from which we see that $X(t) \to KT$ as $t \to \infty$. Thus, for each $\epsilon > 0$, $\{X_n\}$ is in an $\epsilon$-neighborhood $N_\epsilon(KT)$ of $KT$ infinitely

often w.p.1. Fix $\epsilon > 0$. Define $\underline{H} = \liminf_n H(X_n)$. Suppose that $\Omega_1$ and $S_1$ are such that $\underline{H}$ = value of $H(x)$ on $S_1$ if $\omega \in \Omega_1$ and $P\{\Omega_1\} > 0$, and for some $\epsilon_1 > \epsilon > 0$, $X_n$ leaves the $\epsilon_1$-neighborhood $N_{\epsilon_1}(S_1)$ infinitely often for $\omega \in \Omega_1$. Then for (almost all) $\omega \in \Omega_1$ there are real numbers $\Delta_n$, $T \ge k_n > 0$ with $k_n \uparrow T \le \infty$ and a solution $X(\cdot)$ to (3.1) which is a limit of the sequence $\{X^0(\Delta_n + s),\ s \le k_n,\ n = 1, 2, \ldots\}$ and where $X(0) \in \partial N_\epsilon(S_1)$ and either $X(T) \in \partial N_{\epsilon_1}(S_1)$ if $T < \infty$ or else $X(t) \to \partial N_{\epsilon_1}(S_1)$ as $t \to \infty$. Using an argument like that used in [1], Theorem 2.3.5, the last sentence and (3.5) imply that $\underline{H} \ne \liminf_n H(X_n)$ almost everywhere on $\Omega_1$, a contradiction. The next to the last assertion of the theorem is proved in a similar way. Q.E.D.

4. State Dependent and Unbounded Noise

State Dependent and Bounded Noise

There are several ways in which the state dependent and bounded noise case can be treated. The noise can be parameterized as in [4], Section 9. Here, we choose a Markovian representation. Suppose that $\{X_n, \xi_{n-1}\}$ is a Markov process. In applications, this might require an augmentation of the state space of the 'original' $\{\xi_n\}$ and a redefinition of the 'original' $h(\cdot,\cdot)$. Let $E_n$ denote conditioning on $\xi_j$, $j < n$, $X_j$, $j \le n$, and define the 'partial' transition function

$P(\xi, 1, \Gamma | x) = P\{\xi_n \in \Gamma \mid X_n = x,\ \xi_{n-1} = \xi\}.$

It is supposed that $P$ does not depend on $n$, for notational simplicity only.

Write $V_0(x,n)$ in the form

(4.1)  $V_0(x,n) = V_x'(x)\sum_{j=n}^{\infty} a_j\left[\int h(x,\xi)P(\xi_{n-1}, j-n+1, d\xi|x) - \bar h(x)\right].$

Note  that  ^  l^n^  Markov 

property. Assume that the sum in (4.1) is continuously differentiable in $x$, that the derivatives can be taken termwise and that (replacing (A6))

(4.2)  $\left|\sum_{j=n+1}^{\infty} a_j\left[V_x'(x)\left\{\int h(x,\xi)P(\xi_n, j-n, d\xi|x) - \bar h(x)\right\}\right]_x\right| \le Ka_n(1 + |V_x'(x)\bar h(x)|^{1/2}).$

Theorem 3. Assume (A1)-(A7) but with (4.1), (4.2) replacing (A3), (A6), resp., and (A4) replaced by

$\int |h(x,\xi)|^2 P(\xi_{n-1}, j-n, d\xi|x) \le K(1 + |V_x'(x)\bar h(x)|)$,  $j > n$.

Then the conclusions of Theorem 1 hold.

Assume (A1), (A8), (A9) but with the modifications of (A3), (A6) stated above. Then the conclusions of Theorem 2 continue to hold.
Remark on the proof. In the proof the difference (4.3) occurs,

i®n+i  ‘ 


Using the differentiability and the equality below (4.1) and the bounds from (A1)-(A7) (modified for Theorem 3), (4.3) can be seen to be of the order of $a_n^2(1 + |V_x'(X_n)\bar h(X_n)|)$.

The proof of Theorem 3 is the same as those of Theorems 1 and 2.

Unbounded  noise 


We state a generalization of Theorem 1 for the case where $\{\xi_n\}$ is unbounded. First, make the following alterations in the assumptions. Drop the boundedness of $\{\xi_n\}$ in (A1) and suppose that

there  are 

*^0 

<  i 

•  and 

6„  1  >  0 

such  that 

sup(ES„+EY„) 
^  n  n 

n 

‘  *n®n 

n  n 

0  w. 

,  p.l  as  n  -»■  " 

and 

|h(x,5j^)|  < 

■'o'n 

X  € 

Qo 

and 

A3, 4  hold  with 

K  replaced  by 

KYjj.  An  additional  assumption  is  required.  (A6)  and  (A7)  were 

used  in  Theorem  1  to  get  the  bound  (below  (2.4))  on  (2.4).  We 

2  2 

require  that  the  bound  hold  with  replaced  by  ' 

This  is,  perhaps,  an  awkward  way  of  stating  the  assumption,  but  it 
can  be  verified  in  many  standard  examples.  For  an  alternative 
condition  see  the  remark  after  the  example.  We  now  have 
Theorem 4. Under the conditions of Theorem 1, altered as above, the conclusions of Theorem 1 continue to hold.

The  proof  is  very  similar  to  that  of  Theorem  1;  with  only 
a  few  changes  requires;  e.g.,  *n®^n  replaced  by  a  ® 

w.p.l  and  W(n)  ^  -6^  +  0  w.p.l  as  n  There  is  an  analogous 

result  for  the  cases  of  Theorem  2. 

Example. Let $\{\xi_n\}$ be stationary and Markov and $h(x,\xi) = \bar h(x) + h_0(x)g(\xi)$, where $Eg(\xi_n) = 0$, $Eg^2(\xi_n) < \infty$. Here is a

function  of  and  is  a  function  of  1^.  Such  a  form 

n—i  n  n 

occurs in applications to the identification and adaptive control of linear systems, where $\bar h$ and $h_0$ are affine functions of $x$. Then, Theorem 1 holds under a simple stability condition on $\dot x = \bar h(x)$, and under reasonable conditions on $\{\xi_n\}$. A standard and important special case occurs in the identification problem for linear systems

A  A  ^ 

where  we  use  =  ^l^n'  ^n  ”  ^2^n'  ^^n^  Markov  and 

n+1  n  n  n  n  n  n 


♦n  ^  ^n  ^ 

Remark on Theorem 3. The 'unbounded noise' analog of Theorem 3 also holds under the conditions of Theorem 3, modified as follows. (A4) is replaced by the expression in the statement of Theorem 3, but with $K$ replaced by $K\gamma_n$. (4.1) is used for $V_0(x,n)$ and the $K$ there is replaced by $K\gamma_n$. As an alternative to (A6), (A7), assume that

(4.4)  $E_n|\text{left hand side of (4.2)}| \le \gamma_n K(1 + |V_x'(x)\bar h(x)|),$

where $x$ is replaced by $x + sa_n h(x,\xi_n)$, $s \in [0,1]$, in evaluating (4.4). Then under the conditions on the sequences and constants introduced in the paragraph above Theorem 4, the conclusions of the first paragraph of Theorem 3 continue to hold. There is a similar extension of the second paragraph of Theorem 3.

The  following  two  classes  of  examples  have  state  dependent 
noise  and  they  illustrate  two  different  ways  of  using  Theorem  3. 

5.  A  Learning  Automata  Example. 

This example is a modification of one in [5], where $a_n \equiv \epsilon > 0$ and an extensive development of the asymptotic distributional properties is given. Here we are concerned with w.p.1 convergence only, for the case where $a_n \to 0$. A relatively simple case is treated. Clearly, more complicated arrival and adaptive processes and systems can be treated.

The problem. Calls arrive at a switching terminal at random at time instants $n = 0, 1, 2, \ldots$, with P{one call arrives at $n$-th instant} $= \mu \in (0,1)$, P{>1 call arrives at $n$-th instant} $= 0$. There are two possible routings to the destination, routes $i$, $i = 1, 2$, where route $i$ has $N_i$ independent lines and can handle up to $N_i$ calls simultaneously. Let $[n, n+1)$ denote the $n$-th interval of time. The duration of each call has the distribution: P{call completed in the $(n+1)$-st interval | uncompleted at end of $n$-th interval, route $i$ used} $= \lambda_i \in (0,1)$. The members of the sequence of interarrival times and call durations are mutually independent. The use of an adaptive automaton for adjusting the routing comes from [7].

The routing automaton operates as follows. Let $\{X_n\}$ denote a sequence of random variables with values in $[0,1]$. In order to have an unambiguous sequencing of events, let the calls ending in the $n$-th interval actually end at time $n + 1/2$, and let both arrivals and route assignments be at the ends of the intervals; i.e., at the instants $0, 1, 2, \ldots$ precisely. Thus the state of the route occupancy at time $(n+1)^-$ does not include the calls just terminated or calls arriving at $(n+1)$. Define the "route occupancy process" $Z_n = (Z_n^1, Z_n^2)$, where $Z_n^i$ is the number of lines of route $i$ occupied at time $n^+$. Thus, $Z_n^i \le N_i$. If a call arrives at instant $n+1$, the automaton chooses route 1 with probability $X_n$ and route 2 with probability $1 - X_n$. If all lines of the chosen route $i$ are occupied at instant $(n+1)^-$, then the call is switched to route $j$ ($j \ne i$). If all lines of route $j$ are also occupied at instant $(n+1)^-$, then the call is rejected. The choice probabilities $\{X_n\}$ are to be adjusted or adapted according to the 'experience' of the system.

The specific adjustment scheme is the following "linear-reward" algorithm [7]. Let $J_{in}$ denote the indicator of the event {call arrives at $n+1$, is assigned first to route $i$ and is accepted by route $i$}. For practical as well as theoretical purposes, it is important to bound $X_n$ away from the points 0 and 1. Let $0 < x_\ell < x_u < 1$. We use the (projected) algorithm (5.1), where $[\,\cdot\,]_{x_\ell}^{x_u}$ denotes truncation at $x_\ell$ or $x_u$, and $\alpha(x) = 1 - x$, $\beta(x) = -x$.

(5.1)  $X_{n+1} = \left[X_n + a_n\alpha(X_n)J_{1n} + a_n\beta(X_n)J_{2n}\right]_{x_\ell}^{x_u}.$

Some definitions. If the choice probabilities $X_n$ are held fixed at some value $x$ for all $n$, then the route choice automaton still is well defined. For fixed route selection probability $x \in (0,1)$, let $\{Z_n(x) = (Z_n^1(x), Z_n^2(x)),\ 0 \le n < \infty\}$ denote the corresponding route occupancy process. For the process $\{Z_n(x)\}$, the state space $Z = \{(i,j): i \le N_1, j \le N_2\}$ (whose points are ordered in some fixed way) is a single ergodic class, and the probability transition matrix, denoted by $A'(x)$, has infinitely differentiable components. With given initial condition, define $P_n(\alpha|x) = P\{Z_n(x) = \alpha\}$ and define the vector $P_n(x) = \{P_n(\alpha|x), \alpha \in Z\}$. Then $P_{n+1}(x) = A(x)P_n(x)$.

The pair $\{(Z_n, X_n), n \ge 0\}$ is a Markov process on $Z \times [x_\ell, x_u]$ and the marginal transition probability $P\{Z_{n+1} = (k,\ell) \mid Z_n = (i,j), X_n\}$ is just the ($(i,j)$-column, $(k,\ell)$-row) entry of $A(X_n)$. Define the vector $P_n = \{P_n(\alpha), \alpha \in Z\}$, where $P_n(\alpha) = P\{Z_n = \alpha \mid X_i, i < n, Z_0\}$. Then $P_{n+1} = A(X_n)P_n$. Also, let $P(x) = \{P(\alpha|x), \alpha \in Z\}$ denote the unique invariant measure for $\{Z_n(x)\}$, with marginal defined by $P^1(N_1|x)$ = asymptotic probability that $Z_n^1(x) = N_1$, and similarly for route 2. Finally, define the transition probability $P(\alpha, j, \alpha_1|x) = P\{Z_j(x) = \alpha_1 \mid Z_0(x) = \alpha\}$ and define the marginal transition probability

$P^i(\alpha, j, N_i|x) = P\{Z_j^i(x) = N_i \mid Z_0(x) = \alpha\}.$

Define $E_n$ to be the expectation conditioned on $\{Z_i, X_i, i \le n\}$ and set $\nu_i = (1 - \lambda_i)^{N_i}$.

Application  of  Theorem  3. 

We have $h(X_n,\xi_n) = \alpha(X_n)J_{1n} + \beta(X_n)J_{2n}$ and, with $I\{\cdot\}$ denoting the indicator function,

$E_n h(X_n,\xi_n) = \mu\alpha(X_n)X_n[1 - \nu_1 I\{Z_n^1 = N_1\}] + \mu\beta(X_n)(1 - X_n)[1 - \nu_2 I\{Z_n^2 = N_2\}],$

which can be written in the form

(5.2)  $E_n h(X_n,\xi_n) = \mu X_n(1 - X_n)[\nu_2 P^2(Z_n, 0, N_2|X_n) - \nu_1 P^1(Z_n, 0, N_1|X_n)].$

Define $\bar h(\cdot)$ to be the limit

(5.3)  $\bar h(x) = \mu x(1-x)\lim_{n\to\infty} E[\nu_2 P^2(Z_0, n, N_2|x) - \nu_1 P^1(Z_0, n, N_1|x)] = \mu x(1-x)[\nu_2 P^2(N_2|x) - \nu_1 P^1(N_1|x)].$

The sum (A3) is replaced by (since the second part of Theorem 3 is to be used, the $V_x(x)$ component can be dropped)

(5.4)  $V_0(x,n) = \mu x(1-x)\sum_{j=n}^{\infty} a_j\left[\nu_2(P^2(Z_n, j-n, N_2|x) - P^2(N_2|x)) - \nu_1(P^1(Z_n, j-n, N_1|x) - P^1(N_1|x))\right].$

The sum (A6) is replaced by the analogous sum of the derivatives (again drop the $V_x(x)$ component). There is a unique $\bar x \in (0,1)$ such that $\bar h(\bar x) = 0$ and $\bar h(x) > 0$ for $x \in (0,\bar x)$ and $\bar h(x) < 0$ for $x \in (\bar x, 1)$. The $P_n(x)$ and $P_{n,x}(x)$ converge [5] to the limits $P(x)$, $P_x(x)$ geometrically with a rate uniform in $x \in [x_\ell, x_u]$ and in $P_0(x)$ ($P_{0,x}(x) = 0$ is the appropriate initial condition to get the limit for the derivative sequence in (A6)). This result implies that (A3), (A6) exist and converge absolutely and uniformly in $(n, X_n)$ at a geometric rate. See [5] for the details of the convergences.
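The invariant measure and the averaged dynamics (5.3) can be evaluated numerically for small systems by iterating the fixed-$x$ occupancy chain. The sketch below is one possible way to do this under the event ordering described in Section 5 (completions, then arrival and routing); the dictionary representation, the iteration count and the parameter values used in any call are assumptions for illustration, and it takes $\nu_i = (1-\lambda_i)^{N_i}$ as the probability that a full route is still full at the next arrival instant.

```python
from math import comb


def survivors_pmf(z, lam):
    """P{r of z calls in progress remain uncompleted}, r = 0, ..., z."""
    return [comb(z, r) * (1 - lam) ** r * lam ** (z - r) for r in range(z + 1)]


def step(dist, x, mu, lam, N):
    """One step of the fixed-x occupancy chain; dist maps state -> probability."""
    new = {}
    for (z1, z2), p in dist.items():
        for r1, q1 in enumerate(survivors_pmf(z1, lam[0])):
            for r2, q2 in enumerate(survivors_pmf(z2, lam[1])):
                w = p * q1 * q2
                # No arrival at the next instant.
                new[(r1, r2)] = new.get((r1, r2), 0.0) + w * (1 - mu)
                # Arrival: route 1 is tried first w.p. x, route 2 w.p. 1 - x.
                for first, pr in ((0, x), (1, 1 - x)):
                    r = [r1, r2]
                    if r[first] < N[first]:
                        r[first] += 1
                    elif r[1 - first] < N[1 - first]:
                        r[1 - first] += 1
                    key = (r[0], r[1])
                    new[key] = new.get(key, 0.0) + w * mu * pr
    return new


def invariant_measure(x, mu, lam, N, n_iter=400):
    """Approximate P(x) by iterating the chain from the empty state."""
    dist = {(0, 0): 1.0}
    for _ in range(n_iter):
        dist = step(dist, x, mu, lam, N)
    return dist


def h_bar(x, mu, lam, N, n_iter=400):
    """mu*x*(1-x)*[nu_2 P^2(N_2|x) - nu_1 P^1(N_1|x)], as in (5.3)."""
    P = invariant_measure(x, mu, lam, N, n_iter)
    full = [sum(p for s, p in P.items() if s[i] == N[i]) for i in (0, 1)]
    nu = [(1 - lam[i]) ** N[i] for i in (0, 1)]
    return mu * x * (1 - x) * (nu[1] * full[1] - nu[0] * full[0])
```

For two identical routes the averaged dynamics vanish at $x = 1/2$ by symmetry, which gives a simple sanity check; scanning `h_bar` over a grid of $x$ values is one way to locate the root $\bar x$ numerically.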


Part 2 of Theorem 3 now yields Theorem 5 below. Theorem 5 can also be proved directly, via the method of Theorem 2 (here the boundary is only $\{x_\ell, x_u\}$) with the 'corrected' test function (5.4) used in lieu of the sum in (A3).

Theorem 5. Let $\sum a_n^2 < \infty$, $\sum a_n = \infty$. Then if $\bar x \in [x_\ell, x_u]$, we have $X_n \to \bar x$ w.p.1. Otherwise $\{X_n\}$ converges w.p.1 to the point $x_\ell$ or $x_u$ which is nearest to $\bar x$.

6.  Observation  Averaging  for  Stochastic  Approximations. 

The  general  method  of  Theorems  1  and  2  can  be  easily  used  to 
prove  w.p.l  convergence  for  stochastic  approximations  of  the 
Robbins-Monro  or  Kiefer-Wolfowitz  type  but  with  averaged  observa¬ 
tions.  The  main  difficulty  is  due  to  the  fact  that  the  quantity 
which  plays  the  role  of  the  noise  is  always  state  dependent.  The 
idea  will  be  illustrated  via  a  very  simple  example.  We  use  a 
Robbins-Monro scheme to estimate the root of $Kx = 0$, $x$ = scalar, $K > 0$ (but the method is applicable to the general problem). Define the estimates by

(6.1)  $X_{n+1} = [X_n + a_n\xi_n]_{x_\ell}^{x_u}$,  $\xi_n = \alpha\xi_{n-1} - (\beta KX_n + \psi_n)$,

where $\alpha \in [0,1)$, $\beta > 0$ and $\{\psi_n\}$ is a bounded sequence of mutually independent random variables with zero mean value. If $\alpha = 0$, then (6.1) is the usual Robbins-Monro method, truncated at values $x_\ell$, $x_u$. If $\alpha \in (0,1)$, then the observations are exponentially weighted. Theorem 3 requires truncation to some finite interval $[x_\ell, x_u]$. Such truncation is usually done in practice anyway. Define $\bar h(x) = -\beta Kx/(1-\alpha)$ and $h(x,\xi) = \xi$. Instead of writing $V_0(x,n)$ in the form (4.1), it is more convenient to do the following. For each $x$, $n$, define the auxiliary processes $\{\xi_j(x), j \ge n\}$, where the initial condition $\xi_{n-1}(x)$ is to be defined and $\xi_j(x) = \alpha\xi_{j-1}(x) - (\beta Kx + \psi_j)$, $j \ge n$. Write $V_0(x,n)$ as

(6.2)  $V_0(x,n) = \sum_{j=n}^{\infty} a_j V_x(x)E_n[h(x,\xi_j(x)) - \bar h(x)],$

where $\xi_{n-1}(X_n) = \xi_{n-1}$, and $E_n$ denotes expectation conditioned on $X_i$, $i \le n$, $\psi_i$, $i < n$. Note that $\xi_n(X_n) = \xi_n$.
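A short simulation shows the effect of exponentially weighted observations on a truncated Robbins-Monro iteration. The sketch assumes the recursion $\xi_n = \alpha\xi_{n-1} - (\beta KX_n + \psi_n)$ with the iterate updated by $a_n\xi_n$ and truncated to an interval, uniform noise on $[-1,1]$, and gains $a_n = 1/(n+1)$; the function name and all numerical values are hypothetical.

```python
import random


def run_averaged_rm(n_steps=20000, K=1.0, alpha=0.5, beta=1.0,
                    x_lo=-1.0, x_hi=2.0, x0=1.5, seed=1):
    rng = random.Random(seed)
    x = x0
    xi = 0.0                                  # weighted observation, started at zero
    for n in range(n_steps):
        psi = rng.uniform(-1.0, 1.0)          # bounded, independent, zero mean
        xi = alpha * xi - (beta * K * x + psi)
        a_n = 1.0 / (n + 1)
        x = min(max(x + a_n * xi, x_lo), x_hi)  # truncation to [x_lo, x_hi]
    return x
```

For fixed $x$ the stationary mean of $\xi_n$ is $-\beta Kx/(1-\alpha)$, so with zero inside the truncation interval the iterate is expected to approach zero; calling the function with `x_lo=0.5` instead places zero outside the interval, and the iterate is then expected to settle at the endpoint closest to zero.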


Now  Theorem  3  yields 


Theorem 6. Let $\sum a_n^2 < \infty$, $\sum a_n = \infty$. If $0 \in [x_\ell, x_u]$, then $X_n \to 0$ w.p.1. Otherwise $\{X_n\}$ converges w.p.1 to the point $x_\ell$ or $x_u$ which is closest to zero.

In [4] there is an analysis of the asymptotic properties of (6.1) when $a_n \equiv \epsilon > 0$.